What is Write Allocate?



Write allocate, if enabled, occurs when the processor has a
pending memory write cycle to a cacheable line and the line
does not currently reside in the L1 data cache. In this case, the
processor performs a burst read cycle to fetch the cache line
addressed by the pending write cycle. The data associated with
the pending write cycle is merged with the recently-allocated
cache line and stored in the processor’s L1 data cache in the
modified state. The cache line is marked as modified because
the pending write cycle is not performed on the processor’s
external bus.

During the write allocation, a 32-byte burst read cycle is 
executed in place of a non-burst write cycle. While the burst 
read cycle generally takes longer to execute than the write 
cycle, performance gains are realized on subsequent write 
cycle hits to the write-allocated cache line. Due to the nature of 
software, memory accesses tend to occur within proximity of 
each other (principle of locality). The likelihood of additional 
write hits to the write-allocated cache line is high. 
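
The trade-off described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the cache starts empty, has unlimited capacity, only writes are simulated, and the eventual write-back of modified lines is ignored. The function name `count_bus_cycles` is hypothetical and not part of any AMD tool.

```python
LINE_SIZE = 32  # bytes per cache line, as stated in the text


def count_bus_cycles(write_addresses, write_allocate):
    """Count external bus cycles for a sequence of write addresses.

    Toy model: empty cache at start, unlimited capacity, writes only.
    Returns (burst_reads, external_writes).
    """
    cached_lines = set()   # line numbers currently held in the cache
    burst_reads = 0        # 32-byte line fills caused by write allocate
    external_writes = 0    # write cycles driven on the external bus

    for addr in write_addresses:
        line = addr // LINE_SIZE
        if line in cached_lines:
            continue                 # write hit: absorbed by the cache
        if write_allocate:
            burst_reads += 1         # fetch the line, merge the write
            cached_lines.add(line)   # line is now held as modified
        else:
            external_writes += 1     # write miss goes out on the bus

    return burst_reads, external_writes


# Eight consecutive 4-byte writes that all fall in one 32-byte line.
writes = [0x1000 + 4 * i for i in range(8)]
print(count_bus_cycles(writes, write_allocate=True))    # (1, 0)
print(count_bus_cycles(writes, write_allocate=False))   # (0, 8)
```

With write allocate, the first miss costs one burst read and the remaining seven writes hit the cache. Without it, every write in this model goes out on the external bus.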
Programming Details



The steps required for programming write allocate on K86™
processors are as follows:

1. Verify write allocate support by using the CPUID 
instruction to check for the correct type and revision of the 
processor. 

2. Configure the MSRs. 

3. Enable write allocate. 



Step 1:



The first step in supporting the write allocate feature of the 
AMD K86 processors is determining the type of processor. 
Write allocate in the AMD-K5™ processor is supported only on 
Models 1, 2, and 3, with a Stepping of 4 or greater. Write 
allocate in the AMD-K6™ MMX processor is supported only on 
Models with a Stepping of 1 or greater. Use the CPUID 
instruction to determine if the proper model and stepping of 
the processor is present. See the AMD Processor Recognition 
application note, order# 20734 for more information. 
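
The AMD-K5 check can be sketched as follows. The sketch assumes the caller has already executed CPUID function 1, confirmed the AuthenticAMD vendor string, and passes in the returned EAX signature. The field layout (stepping in bits 3-0, model in bits 7-4, family in bits 11-8) and the family value 5 for the AMD-K5 are not stated in this note and are taken from the standard CPUID signature format. The function names are hypothetical. No AMD-K6 check is shown because the text above does not give the model number.

```python
def decode_signature(eax):
    """Split the CPUID function-1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF          # bits 3-0
    model = (eax >> 4) & 0xF      # bits 7-4
    family = (eax >> 8) & 0xF     # bits 11-8
    return family, model, stepping


def k5_supports_write_allocate(eax):
    """AMD-K5 rule from the text: Model 1, 2, or 3 with Stepping >= 4.

    Assumes the vendor string was already verified as AuthenticAMD;
    the signature alone does not identify the vendor.
    """
    family, model, stepping = decode_signature(eax)
    return family == 5 and model in (1, 2, 3) and stepping >= 4


print(decode_signature(0x514))             # (5, 1, 4)
print(k5_supports_write_allocate(0x514))   # True
print(k5_supports_write_allocate(0x513))   # False (stepping too low)
print(k5_supports_write_allocate(0x504))   # False (Model 0)
```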

After determining that the processor supports write allocate, 
the next step is to configure the Model-Specific Registers 
(MSR). 

For an AMD-K5 processor Model 1, 2, or 3 with a Stepping of 4
or greater, go to “Step 2: AMD-K5™ Processor.” For an AMD-K6
MMX processor with a Stepping of 2 or greater, go to “Step 2:
AMD-K6™ MMX Processor.”



Step 2: AMD-K5™ Processor

The AMD-K5 processor implements write allocate by providing 
a global write allocate enable bit, three range-protection 
enable bits, and two memory range registers. The global write 
allocate enable bit is accessed using the Hardware 
Configuration Register (HWCR). The memory range registers 
and range enable bits are programmed by read/write MSR 
instructions. 
Before the write allocate registers are programmed, the Write
Allocate Enable bit (bit 4 of HWCR) should be set to 0, which
prevents potential erroneous behavior in the case of a warm
boot during write allocate initialization.

Two MSRs are defined to support write allocate. The MSRs are 
accessed using the RDMSR and WRMSR instructions (see 
“RDMSR and WRMSR” in the AMD-K5™ Processor Software 
Development Guide, order# 20007). The following index values 
in the ECX register access the MSRs: 

■ Write Allocate Top-of-Memory and Control Register 
(WATMCR)— ECX = 85h 

■ Write Allocate Programmable Memory Range Register 
(WAPMRR)— ECX = 86h 

Three non-write-allocatable memory ranges are defined for use 
with the write allocate feature — one fixed range and two 
programmable ranges. 

Fixed Range. The fixed memory range is 000A_0000h- 
000F_FFFFh and can be enabled or disabled. When enabled, 
write allocate can not be performed in this range. 

This region of memory, which includes standard VGA and other 
peripheral and BIOS access, is considered non-cacheable. 
Performing a write allocate in this area can cause compatibility 
problems. It is recommended that this bit be enabled (set to 1) 
to prevent write allocate to this range. Set bit 16 of WATMCR 
to enable protection of this range. 

Programmable Range. One programmable memory range is 
xxxx_0000h-yyyy_FFFFh, where xxxx and yyyy are defined 
using bits 15-0 and bits 31-16 of WAPMRR, respectively. Set 
bit 17 of WATMCR to enable protection of this range. When 
enabled, write allocate can not be performed in this range. 

This programmable memory range exists because a small 
number of uncommon memory-mapped I/O adapters are 
mapped to physical RAM locations. If a card like this exists in 
the system configuration, it is recommended that the BIOS 
program the ‘memory hole’ for the adapter into this 
non-write-allocatable range. 
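
As a rough illustration of the WAPMRR layout described above, the following sketch packs a memory hole into the 32-bit register value. The function name `wapmrr_value` and its argument checks are hypothetical and are not taken from any AMD source.

```python
def wapmrr_value(start, end):
    """Encode a protected range as the 32-bit WAPMRR value.

    start must have the form xxxx_0000h and end the form yyyy_FFFFh,
    so the range always covers whole 64-Kbyte blocks.
    """
    if start & 0xFFFF != 0 or end & 0xFFFF != 0xFFFF:
        raise ValueError("range must span whole 64-Kbyte blocks")
    if not 0 <= start <= end <= 0xFFFFFFFF:
        raise ValueError("range must lie within the 32-bit address space")
    xxxx = start >> 16     # goes into bits 15-0
    yyyy = end >> 16       # goes into bits 31-16
    return (yyyy << 16) | xxxx


# A 1-Mbyte hole from 15 Mbytes to 16 Mbytes - 1.
print("%08X" % wapmrr_value(0x00F00000, 0x00FFFFFF))   # 00FF00F0
```

The result is only the value for EAX. Bit 17 of WATMCR still has to be set for the range to take effect.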
Top of Memory. The other programmable memory range is
defined by the ‘top-of-memory’ field. The top of memory is 
equal to zzzz_0000h, where zzzz is defined using bits 15-0 of 
WATMCR. Addresses above zzzz_0000h are protected from 
write allocate when bit 18 of WATMCR is enabled. 

Once the BIOS determines the size of RAM installed in the 
system, this size should also be used to program the top of 
memory. For example, a system with 32 Mbytes of RAM 
requires that the top-of-memory field be programmed with a 
value of 0200h, which enables protection from write allocate 
for memory above that value. Set bit 18 of WATMCR to enable 
protection of this range. 

Caching and write allocate are generally not performed for the 
memory above the amount of physical RAM in the system. 
Video frame buffers are usually mapped above physical RAM. 
If write allocate were attempted in that memory area, there 
could be performance degradation or compatibility problems. 

Bits 18-16 of WATMCR control the enabling or disabling of the 
three memory ranges as follows: 

■ Bit 18: Top-of-Memory Enable bit

0 = disabled (default) 

1 = enabled (write allocate can not be performed above Top 
of Memory) 

■ Bit 17: Programmable Range Enable bit 

0 = disabled (default) 

1 = enabled (write allocate can not be performed in this 
range) 

■ Bit 16: Fixed Range Enable bit 

0 = disabled (default) 

1 = enabled (write allocate can not be performed in this 
range) 
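
The top-of-memory field and the three enable bits can be combined into a WATMCR value as in the sketch below. The constant and function names are hypothetical. The sketch assumes the installed RAM size is known in Mbytes and that the upper 32 bits of the MSR are written as zero.

```python
FRE = 1 << 16   # bit 16: Fixed Range Enable
PRE = 1 << 17   # bit 17: Programmable Range Enable
TME = 1 << 18   # bit 18: Top-of-Memory Enable


def watmcr_value(ram_mbytes, fixed=True, programmable=False, top=True):
    """Build the low 32 bits of WATMCR from RAM size and enable flags."""
    tom = (ram_mbytes * 1024 * 1024) >> 16   # zzzz in zzzz_0000h
    if not 0 <= tom <= 0xFFFF:
        raise ValueError("top of memory does not fit in bits 15-0")
    value = tom
    if fixed:
        value |= FRE
    if programmable:
        value |= PRE
    if top:
        value |= TME
    return value


# 16 Mbytes, no memory hole: protect the fixed range and above top of memory.
print("%08X" % watmcr_value(16))                       # 00050100
# 32 Mbytes with a memory hole: all three ranges protected.
print("%08X" % watmcr_value(32, programmable=True))    # 00070200
```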

Figures 1 and 2 show the bit positions for these two new
registers.

Figure 1. Write Allocate Top-of-Memory and Control Register (WATMCR)-MSR 85h

Figure 2. Write Allocate Programmable Memory Range Register (WAPMRR)-MSR 86h



Step 3: AMD-K5™ Processor 

All of the write allocate features in the AMD-K5 processor are
enabled by setting bit 4 (WA) of the HWCR (MSR 83h) to 1. For
more information on the HWCR, see “Hardware Configuration
Register” in the AMD-K5™ Processor Software Development
Guide, order# 20007. Figure 3 shows the definition of HWCR.

The BIOS programmer has several options regarding what the 
end-user can control. The BIOS can provide the end-user with a 
setup screen option to enable write allocate. The BIOS can 
provide the end-user with a setup screen option to also setup 
the other features (programmable ranges and fixed range). The 
BIOS can automatically enable and setup the write allocate 
feature and its registers without end-user intervention. This 
automatic setup is recommended. 
Figure 3. Hardware Configuration Register (HWCR)-MSR 83h



AMD-K5™ Processor Programming Example for Write Allocate Registers 

The following cases show examples of programming the write 
allocate feature for two types of systems: 

Case 1: For systems without a memory hole and 16 Mbytes of total
memory:

■ Program the WATMCR MSR (ECX=85h) with top of
memory (0100h) and enable bits (0005h) to protect the fixed
range and above the top of memory

• Use the WRMSR instruction and the 64-bit hex value
0000_0000_0005_0100h

Note: For 8-Mbyte systems, program 0080h in the lowest 16 bits.
For 32-Mbyte systems, program 0200h in the lowest 16 bits.

Code Sample:

        ;disable WA bit (bit 4 of HWCR)
        MOV     ECX,83H         ;read HWCR (83h)
        RDMSR
        AND     EAX,NOT 10H
        WRMSR

        ;program top-of-memory and control bits
        MOV     ECX,85H         ;select WATMCR
        MOV     EAX,50100H      ;TME=1, PRE=0, FRE=1, TOM=0100h
        XOR     EDX,EDX
        WRMSR

        ;enable WA bit
        MOV     ECX,83H         ;read HWCR (83h)
        RDMSR
        OR      EAX,10H         ;set bit 4
        WRMSR
Case 2: For systems with a 1 Mbyte memory hole starting at the 15
Mbyte boundary and 32 Mbytes of total memory:

■ Program the WAPMRR MSR (ECX=86h) with 15 Mbytes
(00F0h) to 16 Mbytes -1 (00FFh)

• Use the WRMSR instruction and the 64-bit hex value
0000_0000_00FF_00F0h

■ Program the WATMCR MSR (ECX=85h) with top of
memory (0200h) and all enable bits (0007h) to protect above
the top of memory, the fixed range, and the programmable
range

• Use the WRMSR instruction and the 64-bit hex value
0000_0000_0007_0200h

Note: For 8-Mbyte systems, program 0080h in the lowest 16 bits.
For 16-Mbyte systems, program 0100h in the lowest 16 bits.

Code Sample:

        ;disable WA bit (bit 4 of HWCR)
        MOV     ECX,83H         ;read HWCR (83h)
        RDMSR
        AND     EAX,NOT 10H
        WRMSR

        ;program programmable range to 15-16 Mbytes
        MOV     ECX,86H         ;select WAPMRR
        MOV     EAX,0FF00F0H    ;address from F00000 to FFFFFF
        XOR     EDX,EDX         ;clear
        WRMSR

        ;program top of memory and control bits
        MOV     ECX,85H         ;select WATMCR
        MOV     EAX,70200H      ;TME=1, PRE=1, FRE=1, TOM=0200h
        XOR     EDX,EDX         ;clear
        WRMSR

        ;enable WA bit
        MOV     ECX,83H         ;read HWCR (83h)
        RDMSR
        OR      EAX,10H         ;set bit 4
        WRMSR
Step 2: AMD-K6™ MMX Processor

The AMD-K6 MMX processor implements write allocate 
differently than the AMD-K5 processor. This section describes 
two programmable mechanisms used by the AMD-K6 processor 
to determine when to perform write allocate. When either of 
these mechanisms indicates that a pending write is to a 
cacheable area of memory, a write allocate is performed. 

Before programming any registers, the BIOS must writeback 
and invalidate the internal cache by using the WBINVD 
instruction. In addition, prior to setting up the write allocate 
registers, the WHCR should be set to 0000_0000_0000_0000h, 
which prevents potential erroneous behavior in the case of a 
warm boot. 

Write Handling Control Register (WHCR). The Write Handling Control
Register (WHCR) is an MSR that contains three fields: the
WCDE bit, the Write Allocate Enable Limit (WAELIM) field,
and the Write Allocate Enable 15-to-16-Mbyte (WAE15M) bit
(see Figure 4).
Symbol    Description                             Bits

WCDE      Write Cacheability Detection Enable     8
WAELIM    Write Allocate Enable Limit             7-1
WAE15M    Write Allocate Enable 15-to-16-Mbyte    0

Note: Hardware RESET initializes this MSR to all zeros.

Figure 4. Write Handling Control Register (WHCR)-MSR C000_0082h



Write Cacheability Detection Enable. Write Cacheability Detection
causes a write allocate to occur only if the Write Cacheability
Detection Enable (WCDE) bit (bit 8) in the Write Handling
Control Register (WHCR) MSR is set to 1. For more details on
the Write Cacheability Detection Mechanism, see the Cache
Organization chapter in the AMD-K6™ MMX Processor Data
Sheet, order# 20695.

If the address is cacheable, support of the Write Cacheability 
Detection mechanism requires the system logic to assert KEN 
during a write cycle. Some chipsets assert KEN during a write 
cycle and some chipsets do not assert KEN during a write cycle. 
(Triton chipsets eventually generate a correct value for KEN, 
but not during the sample point. Therefore do not enable 
WCDE in systems that use the Triton chipset.) If Write 
Cacheability Detection is enabled, KEN is sampled during 
write cycles in the same manner it is sampled during read 
cycles (KEN is sampled on the clock edge on which the first 
BRDY or NA of a cycle is sampled asserted). 

It is recommended that the BIOS enable the WCDE feature 
only if it is known that the chipset properly asserts KEN during 
a write cycle. 

Write Allocate Enable Limit. The WAELIM field is 7 bits wide. This
field, multiplied by 4 Mbytes, defines an upper memory limit.
Any pending write cycle that addresses memory below this
limit causes the processor to perform a write allocate. Write
allocate is disabled for memory accesses at and above this limit
unless the processor determines a pending write cycle is
cacheable by means of one of the other Write Cacheability
Detection mechanisms. The maximum value of this limit is
((2^7 - 1) x 4 Mbytes) = 508 Mbytes. When all the bits in this
field are set to 0, all memory is above this limit and the write
allocate mechanism is disabled.

The WAELIM field is similar to the AMD-K5 processor 
top-of-memory field. Once the BIOS determines the amount of 
RAM installed in the system, this number should also be used 
to program the WAELIM field. For example, a system with 32 
Mbytes of RAM would program the WAELIM field with the 
value 0001000b. This value (8), when multiplied by 4 Mbytes, 
yields 32 Mbytes as the write allocate limit. 
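
The arithmetic above can be checked with a short sketch. The function name `waelim_for_ram` is hypothetical. Rounding down to a 4-Mbyte multiple and clamping to the 7-bit maximum are assumptions made here so that the limit never extends past installed RAM; the note itself does not say how to handle those cases.

```python
def waelim_for_ram(ram_mbytes):
    """Return the 7-bit WAELIM value for a RAM size given in Mbytes.

    Assumptions (not from the note): sizes that are not a multiple of
    4 Mbytes are rounded down, and sizes above 508 Mbytes are clamped.
    """
    if ram_mbytes < 0:
        raise ValueError("RAM size cannot be negative")
    return min(ram_mbytes // 4, 0x7F)


waelim = waelim_for_ram(32)
print(format(waelim, "07b"))      # 0001000
print(waelim * 4)                 # 32 (limit in Mbytes)
print(waelim_for_ram(1024) * 4)   # 508 (maximum limit in Mbytes)
```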

Write Allocate Enable 15-to-16-Mbyte. The WAE15M bit is used to
enable write allocations for the memory write cycles that
address the 1 Mbyte of memory between 15 Mbytes and 16
Mbytes. This bit must be set to 0 to prevent write allocates in
this memory area. This sub-mechanism of the WAELIM
provides a memory hole to prevent write allocates. This
memory hole is provided to account for a small number of
uncommon memory-mapped I/O adapters that use this
particular memory address space. If the system contains one of
these peripherals, the bit should be set to 0. The WAE15M bit is
ignored if the value in the WAELIM field is set to less than 16
Mbytes.

Implementation of Write Allocate in the K86™ Processors

21326A/0-March 1997

By definition, write allocations in the AMD-K6 are never 
performed in the memory area between 640 Kbytes and 1 
Mbyte. It is not safe to perform write allocations between 640 
Kbytes and 1 Mbyte (000A_0000h to 000F_FFFFh) because it is 
considered a non-cacheable region of memory. 
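
The WHCR fields and the address rules described in this step can be summarized in a small sketch. Both function names are hypothetical. The predicate models only the WAELIM and WAE15M mechanism; it deliberately ignores Write Cacheability Detection (WCDE), which can also trigger a write allocate at or above the limit.

```python
MBYTE = 1024 * 1024


def whcr_value(wcde, waelim, wae15m):
    """Pack WCDE (bit 8), WAELIM (bits 7-1), and WAE15M (bit 0)."""
    if not 0 <= waelim <= 0x7F:
        raise ValueError("WAELIM is a 7-bit field")
    return (int(bool(wcde)) << 8) | (waelim << 1) | int(bool(wae15m))


def limit_allows_write_allocate(addr, waelim, wae15m):
    """True if the WAELIM/WAE15M mechanism alone selects write allocate."""
    if 0xA0000 <= addr <= 0xFFFFF:
        return False                  # 640 Kbytes to 1 Mbyte: never
    if addr >= waelim * 4 * MBYTE:
        return False                  # at or above the limit
    if 15 * MBYTE <= addr < 16 * MBYTE:
        return bool(wae15m)           # hole is controlled by WAE15M
    return True


print("%04X" % whcr_value(0, 8, 0))                       # 0010
print("%04X" % whcr_value(1, 4, 1))                       # 0109
print(limit_allows_write_allocate(0x00F80000, 8, 0))      # False
print(limit_allows_write_allocate(0x00F80000, 8, 1))      # True
```

When WAELIM is below 16 Mbytes, addresses in the 15-to-16-Mbyte region are already at or above the limit, so the WAE15M argument has no effect, which matches the statement that the bit is ignored in that case.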



Step 3: AMD-K6™ MMX Processor

The BIOS programmer has several options regarding what the 
end-user can control. The BIOS can provide the end-user with a 
setup screen option to enable write allocate. The BIOS can 
provide the end-user with a setup screen option for the other 
features (Write Cacheability Detection, Write Allocate Enable 
Limit, and Write Allocate Enable 15-to-16-Mbyte). The BIOS 
can also automatically enable and setup the write allocate 
feature and its registers without end-user intervention. This 
automatic setup is recommended. To disable all write allocate 
features for the AMD-K6 processor, the WHCR must be set to 
0000_0000_0000_0000h — the default value at boot-up. 

AMD-K6™ MMX Processor Programming Example for Write Allocate 
Registers 

The following cases show examples of programming the write 
allocate feature for two types of systems: 

Case 1: For systems that contain chipsets that do not properly support
KEN on write cycles, have a 1 Mbyte memory hole starting at
the 15 Mbyte boundary, and 32 Mbytes of total memory:

■ Program the WHCR MSR (ECX=C000_0082h) with 
WCDE=0, WAELIM=8, and WAE15M=0 

• Use the WRMSR instruction and the 64-bit hex value 
0000_0000_0000_0010h 






Code Sample:

        ;flush cache
        PUSHF                   ;save state
        CLI                     ;disable interrupts
        WBINVD                  ;write back and invalidate cache

        ;set Write Allocate Limit and clear WAE15M bit
        MOV     ECX,0C0000082H
        MOV     EAX,10H         ;WCDE=0,WAELIM=8,WAE15M=0
        XOR     EDX,EDX
        WRMSR
        POPF                    ;restore original state



Case 2: For systems with chipsets that properly support KEN on write
cycles, do not have a memory hole, and have 16 Mbytes of total
memory:

■ Program the WHCR MSR (ECX=C000_0082h) with 
WCDE=1, WAELIM=4, and WAE15M=1 

• Use the WRMSR instruction and the 64-bit hex value 
0000_0000_0000_0109h 



Code Sample:

        ;flush cache
        PUSHF                   ;save state
        CLI                     ;disable interrupts
        WBINVD                  ;write back and invalidate cache

        ;set Write Allocate Limit and set WAE15M bit
        MOV     ECX,0C0000082H
        MOV     EAX,109H        ;WCDE=1,WAELIM=4,WAE15M=1
        XOR     EDX,EDX
        WRMSR
        POPF                    ;restore original state